Tolerating faults while maximizing reward

Authors

  • Hakan Aydin
  • Rami G. Melhem
  • Daniel Mossé
Abstract

The imprecise computation (IC) model is a general scheduling framework capable of expressing the precision vs. timeliness trade-off involved in many current real-time applications. In that model, each task comprises mandatory and optional parts. Although the IC model allows greater scheduling flexibility, mandatory parts still have hard deadlines and must therefore be completed before the task’s deadline even in the presence of faults. In this paper, we address fault-tolerant (FT) scheduling issues for IC tasks. First, we propose two recovery schemes, namely Immediate Recovery and Delayed Recovery. These schemes can be readily applied to provide fault tolerance to mandatory parts by scheduling optional parts appropriately for recovery operations. After deriving the necessary and sufficient conditions for both schemes, we consider the FT-Optimality problem, that is, generating a schedule that is FT and whose reward is maximum among all possible FT schedules. For Immediate Recovery, we present and prove the correctness of an efficient FT-Optimal scheduling algorithm. For Delayed Recovery, we show that the FT-Optimality problem is NP-Hard and thus intractable.

Similar resources

Tolerating Faults in a Mesh with a Row of Spare Nodes

Bruck, J., R. Cypher and C.-T. Ho, Tolerating faults in a mesh with a row of spare nodes, Theoretical Computer Science 128 (1994) 241-252. We present an efficient method for tolerating faults in a two-dimensional mesh architecture. Our approach is based on adding spare components (nodes) and extra links (edges) such that the resulting architecture can be reconfigured as a mesh in the presence of...

Self-stabilization of Byzantine Protocols

Awareness of the need for robustness in distributed systems increases as distributed systems become integral parts of day-to-day systems. Self-stabilizing while tolerating ongoing Byzantine faults are wishful properties of a distributed system. Many distributed tasks (e.g. clock synchronization) possess efficient non-stabilizing solutions tolerating Byzantine faults or conversely non-Byzantine bu...

Axo: Tolerating Delay Faults in Real-Time Systems

We address delay faults: faults that cause a software component to take more time for completing an action than a given deadline. Such faults are particularly of interest in realtime mission-critical control applications that use general-purpose computing platforms to compute setpoints. A violation of realtime constraints associated with setpoints can result in failure. Existing benign and Byza...

Fault-Diagnosis in a Multiple-Path Interconnection Network

Annotation: A two-pass routing scheme is described for communication in a multiprocessor system employing a unique-path multistage interconnection network in the presence of faults in the network. It is capable of tolerating all single faults and many multiple faults in all except the first and last stages of the network. The routing scheme is useful for tolerating both permanent as well as inter...

An Adaptive Algorithm for Tolerating Value Faults and Crash Failures

The AQuA architecture provides adaptive fault tolerance to CORBA applications by replicating objects and providing a high-level method that an application can use to specify its desired level of dependability. This paper presents the algorithms that AQuA uses, when an application’s dependability requirements can change at runtime, to tolerate both value faults in applications and crash failures...

Publication date: 2000